Discriminative Training of Language Model
Abstract
We show how discriminative training methods, namely the Maximum Mutual Information and Maximum Discrimination approaches, can be adopted for the training of N-gram language models used as classifiers working on symbol strings. By estimating the model parameters according to a discriminative objective function instead of Maximum Likelihood (ML), the emphasis is put not on the exact modeling of each class but on the correct classification of the samples. The methods are shown to be suited to a variety of applications, such as the recognition of regulatory DNA sequences and language identification. Using phonotactic information, we achieve an error reduction of 10.7% (phoneme sequences) or 41.9% (codebook classes) with respect to standard ML estimation on a corpus of English and German sentences.
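To make the contrast between the two objectives concrete, here is a minimal sketch, not the authors' implementation. It builds one bigram model per class from counts (the ML baseline), classifies a string by the highest class log-likelihood, and evaluates an MMI-style objective, the summed log posterior of the correct class. The add-alpha smoothing, the uniform class priors, the toy data, and all function names are assumptions made for illustration; the parameter update that would actually maximize the discriminative objective is not shown.

```python
import math

START = "<s>"  # hypothetical start-of-string symbol


def train_bigram(strings, alphabet, alpha=1.0):
    """Count bigrams for one class (ML counts; add-alpha smoothing is an assumption)."""
    counts = {}
    for s in strings:
        prev = START
        for sym in s:
            row = counts.setdefault(prev, {})
            row[sym] = row.get(sym, 0.0) + 1.0
            prev = sym
    return {"counts": counts, "vocab_size": len(alphabet), "alpha": alpha}


def log_likelihood(model, s):
    """log p(s | class) under the smoothed bigram model."""
    total = 0.0
    prev = START
    for sym in s:
        row = model["counts"].get(prev, {})
        num = row.get(sym, 0.0) + model["alpha"]
        den = sum(row.values()) + model["alpha"] * model["vocab_size"]
        total += math.log(num / den)
        prev = sym
    return total


def classify(models, s):
    """Pick the class whose model assigns s the highest log-likelihood."""
    return max(models, key=lambda c: log_likelihood(models[c], s))


def mmi_objective(models, labelled):
    """Sum of log posteriors of the correct class (uniform priors assumed)."""
    log_prior = -math.log(len(models))
    total = 0.0
    for s, c in labelled:
        scores = {k: log_likelihood(m, s) + log_prior for k, m in models.items()}
        top = max(scores.values())
        # log-sum-exp over all classes, computed stably
        log_norm = top + math.log(sum(math.exp(v - top) for v in scores.values()))
        total += scores[c] - log_norm
    return total


# Toy two-class data over the alphabet {a, b} (invented for illustration).
data = {"x": ["abab", "ababab"], "y": ["aabb", "bbaa"]}
alphabet = {"a", "b"}
models = {c: train_bigram(ss, alphabet) for c, ss in data.items()}
labelled = [(s, c) for c, ss in data.items() for s in ss]

print(classify(models, "abab"))
print(mmi_objective(models, labelled))
```

ML estimation fits each class model to its own strings in isolation, whereas a discriminative trainer would adjust the bigram parameters to raise the value returned by `mmi_objective`, which depends on how all class models score each training string.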
Similar resources
Interdependence of Language Models and Discriminative Training
In this paper, the interdependence of language models and discriminative training for large vocabulary speech recognition is investigated. In addition, a constrained recognition approach using word graphs is presented for the efficient determination of alternative word sequences for discriminative training. Experiments have been carried out on the ARPA Wall Street Journal corpus. The recognitio...
Language Identification and Multilingual Speech Recognition Using Discriminatively Trained Acoustic Models
We perform language identification experiments for four prominent South-African languages using a multilingual speech recognition system. Specifically, we show how successfully Afrikaans, English, Xhosa and Zulu may be identified using a single set of HMMs and a single recognition pass. We further demonstrate the effect of language identification-specific discriminative acoustic model training ...
Discriminative Training and Support V Language Call Ro
In natural language call routing, callers are routed to desired departments based on natural spoken responses to an open-ended “How may I direct your call?” prompt. Natural language call classification can be performed using support vector machines (SVMs) or the popular vector-based model used in information retrieval. We recently demonstrate how discriminative training is powerful to improve a...
Discriminative training of language model classifiers
We show how discriminative training methods, namely the Maximum Mutual Information and Maximum Discrimination approach, can be adopted for the training of N-gram language models used as classifiers working on symbol strings. By estimating the model parameters according to a discriminative objective function instead of Maximum Likelihood, the emphasis is not put on the exact modeling of each cla...
Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised di...
Discriminative Training of GMM for Language Identificatio..
In this paper, a discriminative training procedure for a Gaussian Mixture Model (GMM) language identification system is described. The proposal is based on the Generalized Probabilistic Descent (GPD) algorithm and Minimum Classification Error Rates formulated to estimate the GMM parameters. The evaluation is conducted using the OGI multi-language telephone speech corpus. The experimental result...